Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OutputSpec for DerivationGoal and DerivedPath, today's OutputSpec -> ExtendedOutputSpec #6815

Merged
merged 11 commits into from
Jan 13, 2023

Conversation

Ericson2314
Copy link
Member

@Ericson2314 Ericson2314 commented Jul 17, 2022

DerivedPath::Built and DerivationGoal were previously using a regular std::set<std::string< with the convention that the empty set means all outputs. But it is easy to forget about this rule when processing those sets. Using OutputsSpec forces us to get it right.

I did this because of the deduplication @edolstra requested in #4543 (comment). We can't do much good code reuse when DerivedPath doesn't use OutputsSpec like it should.

We need to split OutputsSpec from ExtendedOutputsSpec because the "default" case only makes sense for high-level installables with an outputsToInstall field. For DerivationGoal, and DerivedPath itself, only concrete sets and the "all" wild-card make sense.

@Ericson2314
Copy link
Member Author

Hooray it is fixed!

@Ericson2314 Ericson2314 changed the title OutputSpec for DerivationGoal and DerivedPath, today's OutputSpec -> ExtendedOutputSpec` OutputSpec for DerivationGoal and DerivedPath, today's OutputSpec -> ExtendedOutputSpec Nov 14, 2022
@Ericson2314 Ericson2314 requested a review from edolstra November 14, 2022 20:38
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2022-11-14-nix-team-meeting-minutes-8/23452/1

@Ericson2314
Copy link
Member Author

@edolstra I rebased and in the final 4th commit cleaned up the parser, following up from #4543. This is ready for review!

@Ericson2314
Copy link
Member Author

It is possible to review this one commit-by-commit, and that might make it easier.

@Ericson2314 Ericson2314 force-pushed the better-wanted-outputs branch from f41e2b2 to 5cdd58a Compare December 13, 2022 18:36
@@ -662,7 +664,9 @@ std::tuple<std::string, FlakeRef, InstallableValue::DerivationInfo> InstallableF
priority = aPriority->getInt();
}

if (outputsToInstall.empty() || std::get_if<AllOutputs>(&extendedOutputsSpec)) {
auto * outputsSpec = std::get_if<ExtendedOutputsSpec::Explicit>(&extendedOutputsSpec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote this logic to hopefully be much easier to read. There was some stuff that just seemed redundant? I deleted it but do check.

[&](const OutputsSpec::Names & names) {
return names;
},
}, wantedOutputs.raw());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very similar to the code above (realWantedOutputs), so maybe this could be merged into a method on OutputSpec, e.g. outputSpec.names_or(...).

auto wantedOutputsLeft = wantedOutputs;
auto wantedOutputsLeft = std::visit(overloaded {
[&](const OutputsSpec::All &) {
return StringSet {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

presumably this is to minimize changes to the rest of this function. Should there be a TODO here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe I thought the same, but I think this is actually the natural way for this one specific check --- which is "did we refer to any outputs that don't exist?"

@@ -301,4 +301,49 @@ std::map<DrvOutput, StorePath> drvOutputReferences(
return drvOutputReferences(Realisation::closure(store, inputRealisations), info->references);
}

OutputPathMap resolveDerivedPath(Store & store, const DerivedPath::Built & bfd, Store * evalStore0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename evalStore0 to evalStore_. That's what we usually do for shadowed names.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually why does evalStore need to be passed as a pointer instead of a reference?

Copy link
Member Author

@Ericson2314 Ericson2314 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the rename, I will try to keep that pattern in mind going forward! :)

Responding to the question: In the header it's has a default argument: evalStore = nullptr. I would like to do evalStore = *this and make it a reference but I don't think that is valid C++?

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2022-12-12-nix-team-meeting-minutes-16/24119/1

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2022-12-19-nix-team-meeting-minutes-18/24121/1

@Ericson2314 Ericson2314 force-pushed the better-wanted-outputs branch from 5cdd58a to 0cdbcfc Compare December 20, 2022 21:34
Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Jan 10, 2023
- Add a comment

- Put `OutputsSpec` in a different header (First part of NixOS#6815)

- Make a few stray uses of it in new code use `DerivedPath` instead.
@Ericson2314
Copy link
Member Author

I taught myself in #7543 how to write the tests, I'll return to this once that one is merge (they overlap slightly, so conflicts either way).

@Ericson2314 Ericson2314 force-pushed the better-wanted-outputs branch 2 times, most recently from a667ac4 to 98a42ef Compare January 11, 2023 16:02
#include "nlohmann/json_fwd.hpp"

namespace nix {

typedef std::set<std::string> OutputNames;
struct OutputNames : std::set<std::string> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the STL container classes are not meant to be inherited from. Please don't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a problem when it's a "newtype" like class --- exactly one superclass/field and so hopefully the same representation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... I see that you try to make it harder to create an empty set.
Unfortunately I wouldn't deem this approach to be very effective since you can still write auto myEmptyInstance = OutputNames({}); and it does not even look wrong, because the typename OutputNames does not convey that it's meant to never be empty. For this reason, one still has to put the same effort into the review to find out if we really never construct non-empty sets.

From a design perspective, this is not a case for inheritance.
I suggest to either drop this subclass which is from a design perspective more harm than good, or implement a class template <typename T> class NonEmptySet that has an std::set and that really implements the semantics of a non-empty data structure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, there was one "builder pattern" type situation where I did need to start with the empty set as an identity for merging proper ones in. So I dont really mind that there is an escape hatch. Of course yes the best thing to do would be to use a separate type for this one instance and make it ironclad.

A few little changes preparing for the rest.
Do this prior to making a new more limitted `OutputPath` we will use in
more places.
`DerivedPath::Built` and `DerivationGoal` were previously using a
regular set with the convention that the empty set means all outputs.
But it is easy to forget about this rule when processing those sets.
Using `OutputSpec` forces us to get it right.
This should be a non-empty set, and so we don't want people doing this
by accident. We remove the zero-0 constructor with a little inheritance
trickery.
This forces us to be explicit.

It also requires to rework how `from_json` works. A `JSON_IMPL` is added
to assist with this.
@Ericson2314 Ericson2314 force-pushed the better-wanted-outputs branch from c62a8fb to 0faf532 Compare January 12, 2023 00:16
};

struct DefaultOutputs {
bool operator < (const DefaultOutputs & _) const { return false; }
struct AllOutputs : std::monostate { };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also suggest to not subclass this standard class. Looking over the code, it does not seem to bring much advantage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://en.cppreference.com/w/cpp/utility/variant/monostate It gives me all the comparison functions and other such things I need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like that would be an appropriate use of monostate. It is a small class with a clear purpose, and by subclassing we make its meaning explicit.

What we're really doing here is declaring an algebraic data type, in a verbose, C++ way. Maybe that's why I don't see a problem, or may that's why there won't be a problem. Or maybe I'm misreading @tfc who isn't saying that this will be a problem per se and it isn't quite related to the comment about subclassing stl containers.


typedef std::variant<AllOutputs, OutputNames> _OutputsSpecRaw;

struct OutputsSpec : _OutputsSpecRaw {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here we have another subclass of STL standard classes... please, please, don't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #7479 we have an issue discussing this pattern. I am for it as the best that can be done for Sum types until we switch to Rust.

};

typedef std::variant<DefaultOutputs, AllOutputs, OutputNames> OutputsSpec;
struct DefaultOutputs : std::monostate { };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same pattern here. I have seen many code bases where developers subclassed stl classes - none of them aged well.

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good, especially with the new tests, thanks @Ericson2314!
I'm also approving of using the established algebraic data type pattern, considering that it is better to discuss it as part of #7479. This new usage of it would not constitute any significant tech debt, as already a number of ADTs would have to be migrated to a different pattern.

};

struct DefaultOutputs {
bool operator < (const DefaultOutputs & _) const { return false; }
struct AllOutputs : std::monostate { };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like that would be an appropriate use of monostate. It is a small class with a clear purpose, and by subclassing we make its meaning explicit.

What we're really doing here is declaring an algebraic data type, in a verbose, C++ way. Maybe that's why I don't see a problem, or may that's why there won't be a problem. Or maybe I'm misreading @tfc who isn't saying that this will be a problem per se and it isn't quite related to the comment about subclassing stl containers.

@roberth roberth merged commit d21f549 into NixOS:master Jan 13, 2023
@Ericson2314 Ericson2314 deleted the better-wanted-outputs branch January 13, 2023 15:03
@roberth
Copy link
Member

roberth commented Jan 13, 2023

Merged to unblock further work. The remaining discussion wasn't critical and can be continued in #7479

@edolstra
Copy link
Member

@Ericson2314 Looks like this broke build.aarch64-linux, see https://hydra.nixos.org/build/205195243.

@Ericson2314
Copy link
Member Author

@edolstra @roberth told me about the problem and I made #7604 which I hope fixes it? Though I wasn't able to test it.

@roberth
Copy link
Member

roberth commented Jan 21, 2023

I suspect that this PR has caused the regression:

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-team-report-2022-10-2023-03/27486/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

8 participants